Day 09: Gradients and Automatic Differentiation

2024 iThome Ironman Contest (16th), Self-Challenge Group

Part 9 of the series "30天初探tensorflow之旅" (A 30-Day First Look at TensorFlow)

tomoiris

2024-09-23 17:49:15

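Automatic differentiation became much more convenient in TensorFlow 2.0, where eager execution is the default and tf.GradientTape is the standard way to compute gradients. In TensorFlow 1.x, you typically had to define a computation graph explicitly and run it inside a tf.Session, which made automatic differentiation comparatively awkward to use.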

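Let's start with the definitions of these two terms:
1. Gradient:
The vector of partial derivatives of a function at a given point. It points in the direction in which the function increases fastest, and its magnitude gives the rate of that increase. In TensorFlow, the loss function of a model is what we want to minimize, and the gradient tells us how to adjust the model's parameters to reduce the loss.
2. Automatic differentiation:
A technique for computing derivatives by recording the operations of a computation (the computation graph) and applying the chain rule to them. Unlike symbolic differentiation, it does not manipulate formulas, and unlike numerical differentiation, it does not rely on finite-difference approximations, so it computes derivatives efficiently and accurately up to floating-point precision.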
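To make the contrast with numerical differentiation concrete, here is a minimal pure-Python sketch of the "record the graph, then apply the chain rule" idea. The Var class and its methods are hypothetical names made up for this illustration. It is a toy, not how TensorFlow is implemented, and it only supports addition, multiplication, and constant powers.

```python
class Var:
    """A scalar that remembers how it was computed (a toy computation graph node)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __pow__(self, n):
        # Only constant numeric exponents are supported in this sketch.
        return Var(self.value ** n, ((self, n * self.value ** (n - 1)),))

    def backward(self):
        """Propagate derivatives from this node back to its inputs (chain rule)."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Reverse topological order: every node is handled before its inputs.
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


def f(x):
    return x ** 4 + 3 * x + 2


# Automatic differentiation: exact up to floating-point rounding.
x = Var(2.0)
y = f(x)
y.backward()
autodiff_grad = x.grad

# Numerical differentiation: a central finite difference, which is only approximate.
h = 1e-3
numeric_grad = (f(2.0 + h) - f(2.0 - h)) / (2 * h)

print(autodiff_grad)  # 35.0
print(numeric_grad)   # close to 35, but carries a small truncation error
```

The automatic result matches the analytical derivative 4x^3 + 3 at x = 2, while the finite-difference result depends on the step size h.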

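In TensorFlow, automatic differentiation is done with tf.GradientTape. Operations executed inside its context (the with block) are recorded on the tape, which can then compute the gradient of the loss function. Trainable tf.Variable objects are watched automatically.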
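As a small sketch of recording a loss and asking the tape for its gradient, the snippet below uses a one-parameter linear model. The data, the variable names w and b, and the choice of mean squared error are all assumptions made for illustration.

```python
import tensorflow as tf

# Hypothetical toy data for illustration: the targets follow y = 2x.
x = tf.constant([1.0, 2.0, 3.0])
y_true = tf.constant([2.0, 4.0, 6.0])

# Trainable variables are watched by the tape automatically.
w = tf.Variable(1.0)
b = tf.Variable(0.0)

with tf.GradientTape() as tape:
    y_pred = w * x + b
    loss = tf.reduce_mean(tf.square(y_pred - y_true))

# Passing a list of sources returns a list of gradients in the same order.
dl_dw, dl_db = tape.gradient(loss, [w, b])

print(loss.numpy())
print(dl_dw.numpy(), dl_db.numpy())
```

Worked by hand, the loss is 14/3, the gradient with respect to w is -28/3, and the gradient with respect to b is -4. Both gradients are negative, so increasing w and b would reduce the loss.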

The first example shows the basic syntax, which is also the starting point for higher-order gradients:

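import tensorflow as tf
def f(x):
    return x ** 4 + 3 * x + 2
x = tf.Variable(2.0)
# Operations on x inside the with block are recorded on the tape.
with tf.GradientTape() as tape:
    y = f(x)
# Differentiate y with respect to x: f'(x) = 4x^3 + 3.
dy_dx = tape.gradient(y, x)
# .numpy() extracts the plain value; formatting the tensor itself prints its full repr.
print(f"f(x) = {y.numpy()}")
print(f"f'(x) = {dy_dx.numpy()}")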

This prints:

f(x) = 24.0
f'(x) = 35.0
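For higher-order gradients, one common approach is to nest two tapes: the inner tape computes the first derivative, and the outer tape records that computation so it can be differentiated again. The sketch below reuses the same f(x); the names outer_tape, inner_tape, and d2y_dx2 are chosen here for illustration.

```python
import tensorflow as tf

def f(x):
    return x ** 4 + 3 * x + 2

x = tf.Variable(2.0)

with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        y = f(x)
    # Computing the first derivative inside the outer tape records it there too.
    dy_dx = inner_tape.gradient(y, x)
# Differentiating the first derivative gives the second derivative.
d2y_dx2 = outer_tape.gradient(dy_dx, x)

print(dy_dx.numpy())
print(d2y_dx2.numpy())
```

Analytically, f''(x) = 12x^2, which is 48 at x = 2, so the second printed value should agree with that up to floating-point rounding.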

The second example defines a custom gradient for an operation:

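import tensorflow as tf
@tf.custom_gradient
def my_square(x):
    y = x ** 2
    # grad receives the upstream gradient dy and returns the gradient with respect to x.
    def grad(dy):
        return 2 * x * dy
    return y, grad
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = my_square(x)
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())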

The output is 6.0, which matches the analytical derivative 2x at x = 3.
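A custom gradient becomes more useful when the automatically derived gradient is numerically unstable. The sketch below follows the well-known log(1 + e^x) case: for large x the default gradient multiplies 0 by inf and yields nan, while a hand-written gradient stays finite. The function names log1pexp_naive and log1pexp_stable are chosen here for illustration.

```python
import tensorflow as tf

def log1pexp_naive(x):
    return tf.math.log(1.0 + tf.exp(x))

@tf.custom_gradient
def log1pexp_stable(x):
    e = tf.exp(x)
    def grad(upstream):
        # Algebraically the same derivative, e / (1 + e), rewritten to avoid inf * 0.
        return upstream * (1.0 - 1.0 / (1.0 + e))
    return tf.math.log(1.0 + e), grad

x = tf.constant(100.0)

with tf.GradientTape() as tape:
    tape.watch(x)  # constants are not watched automatically
    y = log1pexp_naive(x)
naive_grad = tape.gradient(y, x)

with tf.GradientTape() as tape:
    tape.watch(x)
    y = log1pexp_stable(x)
stable_grad = tape.gradient(y, x)

print(naive_grad.numpy())   # nan in float32, because exp(100) overflows to inf
print(stable_grad.numpy())  # 1.0
```

Only the backward pass is fixed here. The forward value still overflows to inf at x = 100, so a fully stable version would also need a rewritten forward computation.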

The last example is backpropagation and optimization:

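import tensorflow as tf
import numpy as np
input_dim = 10
samples = 100
# Random inputs and targets, used only to exercise the training loop.
train_data = np.random.rand(samples, input_dim).astype(np.float32)
train_labels = np.random.rand(samples, 1).astype(np.float32)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(input_dim,)),
    tf.keras.layers.Dense(1)
])
# Mean squared error is assumed as the loss for this regression setup.
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()
epochs = 100
for epoch in range(epochs):
    # Forward pass: record the operations needed to compute the loss.
    with tf.GradientTape() as tape:
        predictions = model(train_data, training=True)
        loss = loss_fn(train_labels, predictions)
    # Backward pass: gradients of the loss with respect to every trainable variable.
    gradients = tape.gradient(loss, model.trainable_variables)
    # Optimizer step: update each variable using its gradient.
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    if epoch % 10 == 0:
        print(f'Epoch {epoch}: Loss = {loss.numpy()}')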

This code prints the loss every 10 epochs as training proceeds, so you can follow how the model is doing in real time.
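To show what the optimizer step amounts to, here is a minimal sketch that replaces Adam with plain gradient descent written by hand using assign_sub. This is a swapped-in, simpler update rule, not what Adam does internally. The toy data, the learning rate, and the helper compute_loss are assumptions made for illustration.

```python
import tensorflow as tf

# Hypothetical toy data for illustration: the targets follow y = 3x + 1.
x = tf.constant([0.0, 1.0, 2.0, 3.0])
y_true = 3.0 * x + 1.0

w = tf.Variable(0.0)
b = tf.Variable(0.0)
learning_rate = 0.05  # assumed value, small enough for this data to converge

def compute_loss():
    return tf.reduce_mean(tf.square(w * x + b - y_true))

initial_loss = compute_loss().numpy()

for step in range(300):
    with tf.GradientTape() as tape:
        loss = compute_loss()
    grad_w, grad_b = tape.gradient(loss, [w, b])
    # Plain gradient descent: move each variable against its gradient.
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

final_loss = compute_loss().numpy()
print(initial_loss, final_loss)
print(w.numpy(), b.numpy())
```

After training, w and b should be close to 3 and 1, and the final loss should be far below the initial one. An optimizer such as Adam follows the same pattern of gradients in, variable updates out, but also keeps per-variable running statistics to adapt the step size.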